 Coquimbo Region


Dark Energy Survey Year 3 results: Simulation-based $w$CDM inference from weak lensing and galaxy clustering maps with deep learning. I. Analysis design

Thomsen, A., Bucko, J., Kacprzak, T., Ajani, V., Fluri, J., Refregier, A., Anbajagane, D., Castander, F. J., Ferté, A., Gatti, M., Jeffrey, N., Alarcon, A., Amon, A., Bechtol, K., Becker, M. R., Bernstein, G. M., Campos, A., Rosell, A. Carnero, Chang, C., Chen, R., Choi, A., Crocce, M., Davis, C., DeRose, J., Dodelson, S., Doux, C., Eckert, K., Elvin-Poole, J., Everett, S., Fosalba, P., Gruen, D., Harrison, I., Herner, K., Huff, E. M., Jarvis, M., Kuropatkin, N., Leget, P. -F., MacCrann, N., McCullough, J., Myles, J., Navarro-Alsina, A., Pandey, S., Porredon, A., Prat, J., Raveri, M., Rodriguez-Monroy, M., Rollins, R. P., Roodman, A., Rykoff, E. S., Sánchez, C., Secco, L. F., Sheldon, E., Shin, T., Troxel, M. A., Tutusaus, I., Varga, T. N., Weaverdyck, N., Wechsler, R. H., Yanny, B., Yin, B., Zhang, Y., Zuntz, J., Allam, S., Andrade-Oliveira, F., Bacon, D., Blazek, J., Brooks, D., Camilleri, R., Carretero, J., Cawthon, R., da Costa, L. N., Pereira, M. E. da Silva, Davis, T. M., De Vicente, J., Desai, S., Doel, P., García-Bellido, J., Gutierrez, G., Hinton, S. R., Hollowood, D. L., Honscheid, K., James, D. J., Kuehn, K., Lahav, O., Lee, S., Marshall, J. L., Mena-Fernández, J., Menanteau, F., Miquel, R., Muir, J., Ogando, R. L. C., Malagón, A. A. Plazas, Sanchez, E., Cid, D. Sanchez, Sevilla-Noarbe, I., Smith, M., Suchyta, E., Swanson, M. E. C., Thomas, D., To, C., Tucker, D. L.

arXiv.org Artificial Intelligence

Data-driven approaches using deep learning are emerging as powerful techniques to extract non-Gaussian information from cosmological large-scale structure. This work presents the first simulation-based inference (SBI) pipeline that combines weak lensing and galaxy clustering maps in a realistic Dark Energy Survey Year 3 (DES Y3) configuration and serves as preparation for a forthcoming analysis of the survey data. We develop a scalable forward model based on the CosmoGridV1 suite of N-body simulations to generate over one million self-consistent mock realizations of DES Y3 at the map level. Leveraging this large dataset, we train deep graph convolutional neural networks on the full survey footprint in spherical geometry to learn low-dimensional features that approximately maximize mutual information with target parameters. These learned compressions enable neural density estimation of the implicit likelihood via normalizing flows in a ten-dimensional parameter space spanning cosmological $w$CDM, intrinsic alignment, and linear galaxy bias parameters, while marginalizing over baryonic, photometric redshift, and shear bias nuisances. To ensure robustness, we extensively validate our inference pipeline using synthetic observations derived from both systematic contaminations in our forward model and independent Buzzard galaxy catalogs. Our forecasts yield significant improvements in cosmological parameter constraints, achieving $2-3\times$ higher figures of merit in the $Ω_m - S_8$ plane relative to our implementation of baseline two-point statistics and effectively breaking parameter degeneracies through probe combination. These results demonstrate the potential of SBI analyses powered by deep learning for upcoming Stage-IV wide-field imaging surveys.


A Composite-Loss Graph Neural Network for the Multivariate Post-Processing of Ensemble Weather Forecasts

Lakatos, Mária

arXiv.org Machine Learning

Ensemble forecasting systems have advanced meteorology by providing probabilistic estimates of future states, supporting applications from renewable energy production to transportation safety. Nonetheless, systematic biases often persist, making statistical post-processing essential. Traditional parametric post-processing techniques and machine learning-based methods can produce calibrated predictive distributions at specific locations and lead times, yet often struggle to capture dependencies across forecast dimensions. To address this, multivariate post-processing methods, such as ensemble copula coupling and the Schaake shuffle, are widely applied in a second step to restore realistic inter-variable or spatio-temporal dependencies. The aim of this study is the multivariate post-processing of ensemble forecasts using a graph neural network (dualGNN) trained with a composite loss function that combines the energy score (ES) and the variogram score (VS). The method is evaluated on two datasets: WRF-based solar irradiance forecasts over northern Chile and ECMWF visibility forecasts for Central Europe. The dualGNN consistently outperforms all empirical copula-based post-processed forecasts and shows significant improvements compared to graph neural networks trained solely on either the continuous ranked probability score (CRPS) or the ES, according to the evaluated multivariate verification metrics. Furthermore, for the WRF forecasts, the rank-order structure of the dualGNN forecasts captures valuable dependency information, enabling a more effective restoration of spatial relationships than either the raw numerical weather prediction ensemble or historical observational rank structures. By contrast, for the visibility forecasts, the GNNs trained on CRPS, ES, or the ES-VS combination outperform the calibrated reference.
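
The composite loss mentioned in this abstract can be illustrated with a minimal sketch that uses the standard sample-based definitions of the energy score and the variogram score. The function names, the weight `vs_weight`, and the variogram order `p=0.5` are assumptions made for illustration; the abstract does not say how the two scores are weighted or which order is used.

```python
import math


def energy_score(ensemble, obs):
    """Sample energy score: mean distance to the observation minus
    half the mean pairwise distance between ensemble members."""
    m = len(ensemble)
    to_obs = sum(math.dist(x, obs) for x in ensemble) / m
    spread = sum(math.dist(a, b) for a in ensemble for b in ensemble) / (2 * m * m)
    return to_obs - spread


def variogram_score(ensemble, obs, p=0.5):
    """Unweighted variogram score of order p over all pairs of dimensions."""
    m = len(ensemble)
    d = len(obs)
    total = 0.0
    for i in range(d):
        for j in range(d):
            obs_vario = abs(obs[i] - obs[j]) ** p
            ens_vario = sum(abs(x[i] - x[j]) ** p for x in ensemble) / m
            total += (obs_vario - ens_vario) ** 2
    return total


def composite_loss(ensemble, obs, vs_weight=1.0, p=0.5):
    """Hypothetical combination: ES plus a weighted VS (the weighting is an assumption)."""
    return energy_score(ensemble, obs) + vs_weight * variogram_score(ensemble, obs, p)
```

Both scores are negatively oriented, so lower values indicate better forecasts. An ensemble whose members all coincide with the observation scores zero on both terms.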


Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering

Yao, Tianjun, Li, Haoxuan, Shen, Zhiqiang, Li, Pan, Liu, Tongliang, Zhang, Kun

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains, but their reliability is hindered by outdated knowledge and hallucinations. Retrieval-Augmented Generation (RAG) mitigates these issues by grounding LLMs with external knowledge; however, most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Knowledge graphs, which represent facts as relational triples, offer a more structured and compact alternative. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering (KGQA), with a significant proportion adopting the retrieve-then-reasoning paradigm. In this framework, graph-based retrievers have demonstrated strong empirical performance, yet they still face challenges in generalization ability. In this work, we propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA. RAPL addresses these limitations through three aspects: (1) a two-stage labeling strategy that combines heuristic signals with parametric models to provide causally grounded supervision; (2) a model-agnostic graph transformation approach to capture both intra- and inter-triple interactions, thereby enhancing representational capacity; and (3) a path-based reasoning strategy that facilitates learning from the injected rational knowledge and supports downstream reasoners through structured inputs. Empirically, RAPL outperforms state-of-the-art methods by $2.66\%-20.34\%$, and significantly reduces the performance gap between smaller and more powerful LLM-based reasoners, as well as the gap under cross-dataset settings, highlighting its superior retrieval capability and generalizability. Code is available at: https://github.com/tianyao-aka/RAPL.


Identifying Doppelganger Active Galactic Nuclei across redshifts from spectroscopic surveys

Sareen, Shreya, Panda, Swayamtrupta

arXiv.org Artificial Intelligence

Active Galactic Nuclei (AGNs) are among the most luminous objects in the universe, making them valuable probes for studying galaxy evolution. However, understanding how AGN properties evolve over cosmic time remains a fundamental challenge. This study investigates whether AGNs at low redshift (nearby) can serve as proxies for their high-redshift (distant) counterparts by identifying spectral 'doppelgängers', AGNs with remarkably similar emission line properties despite being separated by vast cosmic distances. We analyze key spectral features of bona fide AGNs using the Sloan Digital Sky Survey's Data Release 16, including the continuum and the Nitrogen (N V), Carbon (C IV), Magnesium (Mg II), Hydrogen-beta (H$β$), and Iron (Fe II, optical and UV) emission lines. We incorporate properties such as equivalent width, velocity dispersion in the form of full width at half maximum (FWHM), and continuum luminosities (135 nm, 300 nm, and 510 nm) closest to these prominent lines. Our initial findings suggest the existence of multiple AGNs with highly similar spectra, hinting at the possibility that local AGNs may indeed share intrinsic properties with high-redshift ones. We showcase here one of the better candidate pairs of AGNs resulting from our analyses.


Machine learning-based probabilistic forecasting of solar irradiance in Chile

Baran, Sándor, Marín, Julio C., Cuevas, Omar, Díaz, Mailiu, Szabó, Marianna, Nicolis, Orietta, Lakatos, Mária

arXiv.org Machine Learning

By the end of 2023, renewable sources covered 63.4% of the total electric power demand of Chile, and in line with the global trend, photovoltaic (PV) power shows the most dynamic increase. Although Chile's Atacama Desert is considered the sunniest place on Earth, PV power production, even in this area, can be highly volatile. Successful integration of PV energy into the country's power grid requires accurate short-term PV power forecasts, which can be obtained from predictions of solar irradiance and related weather quantities. Nowadays, in weather forecasting, the state-of-the-art approach is the use of ensemble forecasts based on multiple runs of numerical weather prediction models. However, ensemble forecasts still tend to be uncalibrated or biased, thus requiring some form of post-processing. The present work investigates probabilistic forecasts of solar irradiance for Regions III and IV in Chile. To this end, 8-member short-term ensemble forecasts of solar irradiance for calendar year 2021 are generated using the Weather Research and Forecasting (WRF) model, which are then calibrated using the benchmark ensemble model output statistics (EMOS) method based on a censored Gaussian law, and its machine learning-based distributional regression network (DRN) counterpart. Furthermore, we also propose a neural network-based post-processing method resulting in improved 8-member ensemble predictions. All forecasts are evaluated against station observations for 30 locations, and the skill of post-processed predictions is compared to the raw WRF ensemble. Our case study confirms that all studied post-processing methods substantially improve both the calibration of probabilistic forecasts and the accuracy of point forecasts. Among the methods tested, the corrected ensemble exhibits the best overall performance. Additionally, the DRN model generally outperforms the corresponding EMOS approach.


The Palomar twilight survey of 'Ayló'chaxnim, Atiras, and comets

Bolin, B. T., Masci, F. J., Coughlin, M. W., Duev, D. A., Ivezić, Ž., Jones, R. L., Yoachim, P., Ahumada, T., Bhalerao, V., Choudhary, H., Contreras, C., Cheng, Y. -C., Copperwheat, C. M., Deshmukh, K., Fremling, C., Granvik, M., Hardegree-Ullman, K. K., Ho, A. Y. Q., Jedicke, R., Kasliwal, M., Kumar, H., Lin, Z. -Y., Mahabal, A., Monson, A., Neill, J. D., Nesvorný, D., Perley, D. A., Purdum, J. N., Quimby, R., Serabyn, E., Sharma, K., Swain, V.

arXiv.org Artificial Intelligence

Near-Sun twilight sky observations allow for the detection of asteroids interior to the orbit of Venus (Aylos), asteroids interior to the orbit of the Earth (Atiras), and comets. We present the results of observations with the Palomar 48-inch telescope (P48)/Zwicky Transient Facility (ZTF) camera in 30 s r-band exposures taken during evening astronomical twilight from 2019 Sep 20 to 2022 March 7 and during morning astronomical twilight from 2019 Sep 21 to 2022 Sep 29. More than 46,000 exposures were taken in evening and morning astronomical twilight within 31 to 66 degrees from the Sun with an r-band limiting magnitude between 18.1 and 20.9. The twilight pointings show a slight seasonal dependence in limiting magnitude and ability to point closer towards the Sun, with limiting magnitude slightly improving during summer. In total, one Aylo, (594913) 'Ayló'chaxnim, and 4 Atiras, 2020 OV1, 2021 BS1, 2021 PB2, and 2021 VR3, were discovered in evening and morning twilight observations. Additional twilight survey discoveries, made using deep learning comet detection pipelines, include 6 long-period comets: C/2020 T2, C/2020 V2, C/2021 D2, C/2021 E3, C/2022 E3, and C/2022 P3, and two short-period comets: P/2021 N1 and P/2022 P2. The P48/ZTF twilight survey also recovered 11 known Atiras, one Aylo, three short-period comets, two long-period comets, and one interstellar object. Lastly, the Vera Rubin Observatory will conduct a twilight survey starting in its first year of operations and will cover the sky within 45 degrees of the Sun. Twilight surveys such as those by ZTF and future surveys will provide opportunities for discovering asteroids inside the orbits of Earth and Venus.


Deep Learning Driven Detection of Tsunami Related Internal Gravity Waves: a path towards open-ocean natural hazards detection

Constantinou, Valentino, Ravanelli, Michela, Liu, Hamlin, Bortnik, Jacob

arXiv.org Artificial Intelligence

Tsunamis can trigger internal gravity waves (IGWs) in the ionosphere, producing perturbations of the Total Electron Content (TEC), referred to as Traveling Ionospheric Disturbances (TIDs), that are detectable through the Global Navigation Satellite System (GNSS). GNSS comprises constellations of satellites providing signals from Earth orbit: Europe's Galileo, the United States' Global Positioning System (GPS), Russia's Global'naya Navigatsionnaya Sputnikovaya Sistema (GLONASS) and China's BeiDou. The real-time detection of TIDs provides an approach for tsunami detection, enhancing early warning systems by providing open-ocean coverage in geographic areas not serviceable by buoy-based warning systems. Large volumes of GNSS data are leveraged by deep learning, which effectively handles complex non-linear relationships across thousands of data streams. We describe a framework that combines slant total electron content (sTEC) from the VARION (Variometric Approach for Real-Time Ionosphere Observation) algorithm with Gramian Angular Difference Fields (from computer vision) and Convolutional Neural Networks (CNNs) to detect TIDs in near-real-time. Historical data from the 2010 Maule, 2011 Tohoku and the 2012 Haida-Gwaii earthquakes and tsunamis are used in model training, and the later-occurring 2015 Illapel earthquake and tsunami in Chile for out-of-sample model validation. Using the experimental framework described in the paper, we achieved a 91.7% F1 score. Source code is available at: https://github.com/vc1492a/tidd. Our work represents a new frontier in detecting tsunami-driven IGWs in the open ocean, dramatically improving the potential for natural hazards detection for coastal communities.
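
As an illustration of the Gramian Angular Difference Field named in this abstract, the sketch below encodes a one-dimensional series as an image using the standard definition: rescale to [-1, 1], map each value to an angle with arccos, and take the sine of pairwise angle differences. The function name `gadf` is hypothetical, and the abstract does not describe the windowing or preprocessing of sTEC data, so none is assumed here.

```python
import math


def gadf(series):
    """Gramian Angular Difference Field of a 1-D series (standard definition).

    Returns a square matrix G with G[i][j] = sin(phi_i - phi_j), where
    phi_k = arccos of the k-th value after min-max rescaling to [-1, 1].
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [(2.0 * v - hi - lo) / (hi - lo) for v in series]
    # Clamp to guard against tiny floating-point overshoot outside [-1, 1].
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]
```

The resulting matrix is antisymmetric with a zero diagonal and can be passed to a CNN as a single-channel image.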


Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients

Kennamer, Noble, Ishida, Emille E. O., Gonzalez-Gaitan, Santiago, de Souza, Rafael S., Ihler, Alexander, Ponder, Kara, Vilalta, Ricardo, Moller, Anais, Jones, David O., Dai, Mi, Krone-Martins, Alberto, Quint, Bruno, Sreejith, Sreevarsha, Malz, Alex I., Galbany, Lluis

arXiv.org Artificial Intelligence

The recent increase in volume and complexity of available astronomical data has led to a wide use of supervised machine learning techniques. Active learning strategies have been proposed as an alternative to optimize the distribution of scarce labeling resources. However, due to the specific conditions in which labels can be acquired, fundamental assumptions, such as sample representativeness and labeling cost stability, cannot be fulfilled. The Recommendation System for Spectroscopic follow-up (RESSPECT) project aims to enable the construction of optimized training samples for the Rubin Observatory Legacy Survey of Space and Time (LSST), taking into account a realistic description of the astronomical data environment. In this work, we test the robustness of active learning techniques in a realistic simulated astronomical data scenario. Our experiment takes into account the evolution of training and pool samples, different costs per object, and two different sources of budget. Results show that traditional active learning strategies significantly outperform random sampling. Nevertheless, more complex batch strategies are not able to significantly outperform simple uncertainty sampling techniques. Our findings illustrate three important points: 1) active learning strategies are a powerful tool to optimize the label-acquisition task in astronomy, 2) for upcoming large surveys like LSST, such techniques allow us to tailor the construction of the training sample for the first day of the survey, and 3) the peculiar data environment related to the detection of astronomical transients is a fertile ground that calls for the development of tailored machine learning algorithms.
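
A minimal sketch of uncertainty sampling under a labeling budget, in the spirit of the setup this abstract describes. It assumes a binary classifier whose predicted probabilities are already available, and it uses a simple greedy rule that skips objects that no longer fit the remaining budget. The function name and the greedy rule are illustrative assumptions, not the RESSPECT implementation.

```python
def uncertainty_sampling(probs, costs, budget):
    """Pick pool objects whose predicted probability is closest to 0.5,
    most uncertain first, while the total labeling cost stays within budget.

    probs  -- predicted class-1 probability for each pool object
    costs  -- labeling cost for each pool object (same order as probs)
    budget -- total cost available for this round
    Returns the indices of the selected objects in selection order.
    """
    # Rank by distance from 0.5: a smaller distance means higher uncertainty.
    order = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen
```

For example, `uncertainty_sampling([0.9, 0.55, 0.2, 0.48], [1, 2, 1, 3], 4)` selects object 3 first, skips object 1 because it would exceed the budget, and then adds object 2.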


A VIKOR and TOPSIS focused reanalysis of the MADM methods based on logarithmic normalization

Zolfani, Sarfaraz, Yazdani, Morteza, Pamucar, Dragan, Zaraté, Pascale

arXiv.org Artificial Intelligence

Decision-makers and policy-makers in multi-criteria decision-making analysis adopt various strategies to analyze outcomes and reach effective, more precise decisions. Among those strategies, how to modify the normalization process in a multiple-criteria decision-making algorithm remains an open question, because many competing normalization tools exist. Normalization is a basic operation in defining and solving a MADM problem and model, and it is the first, necessary step in applying a MADM method. The selection of a normalization method has a direct effect on the results. One of the latest normalization methods introduced is the Logarithmic Normalization (LN) method. This method has a distinctive advantage: the sum of the normalized values of the criteria always equals 1. It had never before been applied in any MADM method. This study analyzes classical MADM methods based on logarithmic normalization. VIKOR and TOPSIS, two well-known MADM methods, were selected for this reanalysis. Two numerical examples were solved with both methods, using both the classical normalization and the novel LN-based approach. The results indicate that there are differences between the two approaches. Finally, a sensitivity analysis is designed to illustrate the reliability of the final results.
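
The sum-to-one property of logarithmic normalization can be checked with a short sketch. It assumes the commonly cited form of the method, in which each value's logarithm is divided by the logarithm of the product of all values for that criterion, with a complementary formula for cost criteria. The abstract does not state the formula, so the expressions and the function name below are assumptions.

```python
import math


def log_normalize(column, benefit=True):
    """Logarithmic normalization of one criterion across m alternatives.

    Assumed form: r_i = ln(x_i) / ln(prod_k x_k) for a benefit criterion,
    and r_i = (1 - ln(x_i) / ln(prod_k x_k)) / (m - 1) for a cost criterion.
    Requires strictly positive values whose product is not 1.
    """
    if any(v <= 0 for v in column):
        raise ValueError("all values must be strictly positive")
    # The log of the product equals the sum of the logs, which avoids overflow.
    log_product = sum(math.log(v) for v in column)
    if log_product == 0.0:
        raise ValueError("the product of the values must differ from 1")
    shares = [math.log(v) / log_product for v in column]
    if benefit:
        return shares
    m = len(column)
    return [(1.0 - s) / (m - 1) for s in shares]
```

For the column `[2, 4, 8]`, the benefit form gives 1/6, 1/3 and 1/2, and the cost form gives 5/12, 1/3 and 1/4. Each set sums to 1.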


Scientists are using satellites to spot stranded whales from SPACE

Daily Mail - Science & tech

Satellites could help locate stranded whales more efficiently and in real-time. Scientists have begun harnessing the power of the technology's high-resolution imagery to detect and monitor whales stranded on the shore from space. The team noted that the use of satellites will help find stranded whales in remote locations, as well as spot potentially deteriorating ocean conditions. Chile witnessed one of the largest mass mortalities of baleen whales in 2015 on the remote beaches of Patagonia, where at least 343 died.